1. Introduction


Reducible flow graphs (rfg's) are graphs that model the control structure of computer
programs. They are used extensively in code optimization and global data flow analysis.
Several linear-time sequential algorithms for these graphs are known, including algorithms for
recognizing rfg's [Ta1, GaTa], for finding dominators [Ha], and for finding a minimum feedback
vertex set (FVS) [Sh]. The basis for all of these fast sequential algorithms is a depth first
search on the input graph. Recently, we have developed polynomial-time algorithms for finding a
minimum-weight FVS in vertex-weighted rfg's and a minimum feedback arc set (FAS) in arc-weighted
or unweighted rfg's [Ra1]. These algorithms make extensive use of algorithms for network flow
[FoFu, La, PaSt, Ta2, GoTa]. It is also known that the sequential complexity of these latter
problems is at least that of finding a minimum cut in a flow network [Ra1, Ra2].

In this paper we give parallel NC algorithms for recognizing rfg's, for finding dominators,
and for finding a minimum FVS in an unweighted rfg. We note that the problem of finding a
minimum FVS in cyclically reducible graphs, a class closely related to rfg's, is reported to be
P-complete in [BoDAPe]. We also give an NC algorithm for finding a depth first search (DFS)
numbering for an rfg; however, none of our other parallel algorithms makes use of this DFS
numbering. The processor bound for all of these NC algorithms is the number of processors needed
by an NC algorithm to multiply two n×n matrices. This bound is good with respect to current
NC algorithms for directed graphs, since most of these algorithms require this number of
processors.

We show that if arbitrary weights are allowed, the weighted FAS and FVS problems on
rfg's are both P-complete. Hence fast parallel algorithms for these problems appear unlikely to
exist. For the case when the weights are in unary, we present an RNC algorithm for the FAS
problem on rfg's. We also give NC reductions between the weighted FAS problem, the
unweighted FAS problem, the weighted FVS problem and the problem of finding a minimum cut
in a flow network (when weights and capacities are in unary). Thus if any one of these problems
is in NC, then all of them would be in NC. In particular, an NC algorithm for the maximum
matching problem would give NC algorithms for these three problems on rfg's, and an NC
algorithm for any one of these three problems would, in turn, give an NC algorithm for maximum
matching.

A preliminary version of this paper appeared in [Ra3]. Some of the NC algorithms in the
present paper use a smaller number of processors than the corresponding ones in [Ra3].

This paper is organized as follows. In section 2 we provide definitions. In section 3 we
present our parallel algorithms for preprocessing an rfg. In section 4 we present a parallel
algorithm for finding a minimum FVS in an unweighted rfg. Finally, in section 5 we give an RNC
algorithm for finding a minimum FAS in an unweighted rfg, and present P-completeness results
for the weighted FAS and FVS problems on rfg's.


2.  Definitions 


2.1.  Model  of  Parallel  Computation 

The parallel model of computation that we will be using is the PRAM model, which consists
of several independent sequential processors, each with its own private memory, communicating
with one another through a global memory. In one unit of time, each processor can read one
global or local memory location, execute a single RAM operation, and write into one global or
local memory location.

PRAMs are classified according to restrictions on global memory access. An EREW PRAM
is a PRAM for which simultaneous access to any memory location by different processors is
forbidden for both reading and writing. In a CREW PRAM simultaneous reads are allowed but no
simultaneous writes. A CRCW PRAM allows simultaneous reads and writes. In this case we have
to specify how to resolve write conflicts. We will use the COMMON model, in which all
processors participating in a concurrent write must write the same value. Of the three PRAM
models we have listed, the EREW model is the most restrictive, and the COMMON CRCW model is the
most powerful. It is not difficult to see that any algorithm for the COMMON CRCW PRAM that
runs in parallel time T using P processors can be simulated by an EREW PRAM (and hence by a
CREW PRAM) in parallel time T log P using the same number of processors, P.


Define polylog(n) = ∪_{k>0} O(log^k n). The class NC is the class of problems solvable in
polylog(n) parallel time with a number of processors polynomial in n, where n is the size of the
input. This class is generally accepted to characterize the class of problems that can be solved
feasibly in parallel.

The class P is the class of problems solvable by a sequential algorithm running in
polynomial time. A problem is P-complete if every problem in P can be reduced to it in logspace.
A P-complete problem is in NC if and only if NC = P. Since it is widely conjectured that NC is a
proper subset of P, showing a problem to be P-complete is strong evidence that the problem is not
in NC.

For problems in NC, we would like to develop algorithms that run in polylog parallel time
and also use a small number of processors. For undirected graphs there are algorithms known for
several problems that run in polylog time using a linear number of processors (or fewer) on a
PRAM; these problems include graph connectivity, biconnectivity and triconnectivity, s-t
numbering, planarity, etc. For directed graphs, unfortunately, such efficient parallel algorithms
are not known, mainly due to the transitive closure bottleneck. The best parallel method known at
present to test reachability from one vertex to another in a directed graph is to find the
transitive closure of the adjacency matrix of the graph. To compute this in polylog time requires
n^α processors (to within a polylog factor), where α is the matrix multiplication exponent, which
is currently 2.375 (but for practical computations should be taken as 3). Thus, since rfg's are
directed graphs, all of the algorithms we present in this paper are affected by the transitive
closure bottleneck.

The algorithms we develop in this paper make use of some well-known basic parallel
algorithms as subroutines. We conclude this section with a brief review of these algorithms. For
more on the PRAM model and PRAM algorithms see [KarRa].

1. Boolean matrix multiplication and transitive closure: The standard matrix multiplication
algorithm can be parallelized to give a constant time algorithm using n^3 processors on a COMMON
PRAM to find the product of two n×n Boolean matrices. Since (I+B)^m, for m ≥ n, gives the
transitive closure B* of an n×n Boolean matrix B, B* can be computed using log n stages of
Boolean matrix multiplication by repeated squaring and thus in O(log n) time on a COMMON
PRAM with n^3 processors. The more sophisticated matrix multiplication algorithms (that work
for matrices over a ring, and can be adapted to Boolean matrix multiplication) lend themselves to
parallelization on an EREW or CREW PRAM. Thus multiplication of two n×n Boolean matrices
can be done in O(log n) time with M(n) = n^α processors on an EREW PRAM, and hence B* can
be obtained in O(log^2 n) time using M(n) processors on an EREW PRAM.
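
The repeated-squaring idea can be illustrated sequentially. The sketch below is not a PRAM implementation; the function names `bool_matmul` and `transitive_closure` are chosen here for illustration. It starts from I+B and squares until the exponent is at least n, so the loop runs about log n times, matching the number of multiplication stages described above.

```python
def bool_matmul(X, Y):
    """Boolean product of two n x n matrices given as lists of lists of bools."""
    n = len(X)
    return [[any(X[i][k] and Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(B):
    """Return (I + B)^m for some m >= n, computed by repeated squaring."""
    n = len(B)
    # Start from I + B: every vertex reaches itself and its direct successors.
    M = [[bool(B[i][j]) or i == j for j in range(n)] for i in range(n)]
    power = 1
    while power < n:          # after k squarings M equals (I + B)^(2^k)
        M = bool_matmul(M, M)
        power *= 2
    return M
```

Entry (i, j) of the result is true exactly when j is reachable from i by a path of length zero or more, so the diagonal is always true.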

In many of the algorithms we present, the processor count is dominated by the number of
processors needed to multiply two n×n Boolean matrices. The steps that do not require matrix
multiplication typically need O(n^2) processors. In such cases, we will state our processor-time
bound as parallel time t(n) using Q(n) processors. This will imply that the algorithm runs in
time t(n) with O(n^3) processors on a COMMON PRAM, and it runs in time O(t(n)·log n) with
O(n^α) processors on an EREW or CREW PRAM. Any future improvement in the processor
count for matrix multiplication on any of these types of PRAM will cause a corresponding
improvement in the processor bound for the algorithm, since the remaining steps need O(n^2)
processors.

2. Prefix sums: Let + be an associative operation over a domain D. Given an ordered list
<x_1, ..., x_n> of n elements from D, the prefix problem is to compute the prefix sums
S_i = x_1 + x_2 + ... + x_i, i = 1, ..., n. This problem has several applications. For example,
consider the problem of compacting a sparse array, i.e., we are given an array of n elements,
many of which are zero, and we wish to generate a new array containing the nonzero elements in
their original order. We can compute the position of each nonzero element in the new array by
assigning value 1 to the nonzero elements, and computing prefix sums with + operating as regular
addition.

The n-element prefix sums problem can be computed in O(log n) time using n/log n
processors on an EREW PRAM, assuming unit time for a single + operation.
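
The compaction example can be sketched as follows. This is a sequential simulation written for illustration: `prefix_sums` uses a simple doubling scheme with about log n rounds (it corresponds to n processors per round, not the n/log n bound quoted above), and `compact` is a hypothetical helper name.

```python
import operator


def prefix_sums(xs, op=operator.add):
    """Inclusive prefix sums under an associative op, in about log2(n) rounds."""
    s = list(xs)
    d = 1
    while d < len(s):
        # Each round combines position i with position i - d; on a PRAM all
        # positions would be updated simultaneously.
        s = [s[i] if i < d else op(s[i - d], s[i]) for i in range(len(s))]
        d *= 2
    return s


def compact(values):
    """Return the nonzero entries of values in their original order."""
    flags = [1 if x != 0 else 0 for x in values]
    pos = prefix_sums(flags)              # pos[i] = number of nonzeros in values[0..i]
    out = [None] * (pos[-1] if pos else 0)
    for x, f, p in zip(values, flags, pos):
        if f:
            out[p - 1] = x                # the p-th nonzero goes to slot p - 1
    return out
```

Because operands are always combined in left-to-right order, the scan also works for associative operations that are not commutative, such as string concatenation.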

3. List Ranking: This is a generalization of prefix sums, in which the ordered list is given in
the form of a linked list rather than an array. List ranking on n elements can be computed by a
simple algorithm in O(log n) time using n processors on an EREW PRAM; more elaborate
algorithms for the problem run in O(log n) time using n/log n processors.
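
A minimal sketch of the simple n-processor approach, assuming it is the usual pointer-jumping scheme (the text does not name the technique). The list is given as a successor array `succ`, a representation chosen here for illustration; each round is simulated sequentially using copies of the arrays so that all updates read the values from the previous round.

```python
def list_rank(succ):
    """succ[i] is the index following i in the list, or None for the last element.

    Returns rank[i] = number of links from element i to the end of the list.
    """
    n = len(succ)
    nxt = list(succ)
    rank = [0 if nxt[i] is None else 1 for i in range(n)]
    while any(p is not None for p in nxt):
        new_rank, new_nxt = rank[:], nxt[:]
        for i in range(n):                      # one "processor" per element
            if nxt[i] is not None:
                new_rank[i] = rank[i] + rank[nxt[i]]
                new_nxt[i] = nxt[nxt[i]]        # jump over the successor
        rank, nxt = new_rank, new_nxt
    return rank
```

Each round doubles the distance spanned by every pointer, so the loop runs about log n times.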

4. Tree contraction: Tree contraction is a method of evaluating tree functions efficiently in
parallel. The method transforms the input tree using two operations, Rake and Compress. The
operation Rake removes leaves from the tree. The operation Compress halves the lengths of chains
in the tree (a chain is a sequence of vertices with exactly one incoming and outgoing arc in the
tree) by removing alternate nodes in the chain and linking each remaining node to the parent of
its parent in the original tree. The Contract operation is one application of Rake followed by
one application of Compress. It can be shown [MiRe] that O(log n) applications of the Contract
operation to an n node tree are sufficient to transform the tree into a single vertex. This
contraction can be done in O(log n) time with n processors on a CREW PRAM. More elaborate
algorithms for this problem run in O(log n) time with n/log n processors on an EREW PRAM.

Some  of  the  algorithms  we  will  present  in  this  paper  will  use  a  modified  tree  contraction 
method,  which  we  describe  at  the  end  of  section  3. 

2.2.  Graph-theoretic  Definitions 

A directed graph G = (V, A) consists of a finite set of vertices (or nodes) V and a set of arcs
A which is a subset of V × V. An arc a = (v_1, v_2) is an incoming arc to v_2 and an outgoing arc
from v_1. Vertex v_1 is a predecessor of v_2 and v_2 is a successor of v_1; v_1 is the tail of a and v_2 is
its head. Given a directed graph G = (V, A) and a set of arcs C, we will sometimes use the
notation C ∩ G to denote the set C ∩ A. An arc-weighted (vertex-weighted) directed graph is a
directed graph with a real value on each arc (vertex).

A directed path p in G from vertex u to vertex v is a sequence of arcs a_1, ..., a_r in A such
that a_i = (w_i, w_{i+1}), i = 1, ..., r, with w_1 = u and w_{r+1} = v. The path p passes through each
w_i, i = 1, ..., r+1. A directed path p from u to v is a cycle if u = v. A DAG is a directed acyclic
graph, i.e., a directed graph with no cycle. A rooted directed graph or a flow graph G = (V, A, r)
is a directed graph with a distinguished vertex r such that there is a directed path in G from r
to every vertex v in V-{r}.

A rooted tree is a flow graph T = (V, A, r) in which every vertex in V-{r} has exactly one
incoming arc. If (u, v) is the unique incoming arc to v then u is the parent of v, and v is a child
of u. A leaf is a vertex in a tree with no outgoing arc. The height of a vertex v in a tree is the
length of a longest path from v to a leaf. The height of a tree is the height of its root. A forest
is a collection of trees.

Let G = (V, A, r) be a rooted DAG. A vertex u is a descendant of vertex v if either u = v or
there is a directed path from v to u in G. The vertex u is a proper descendant of v if u ≠ v and u
is a descendant of v.

Let G = (V, A) be an arc-weighted directed graph. A set F ⊆ A is a feedback arc set (FAS)
for G if G' = (V, A-F) is acyclic. The set F is a minimum FAS if the sum of the weights of arcs
in F is minimum. Analogous definitions hold for a feedback vertex set.

Let G = (V, A) be a directed graph, and let V' ⊆ V. The subgraph of G induced by V' is the
graph G' = (V', A'), where A' = A ∩ (V' × V'). The graph G-V' is the subgraph of G induced on
V-V'.

A reducible (flow) graph (or rfg) is a rooted directed graph for which the rooted depth first
search DAG [Ta2] is unique. Thus, the arcs in a reducible graph can be partitioned in a unique
way into two sets: the DAG or forward arcs, and the back arcs.

An  alternate  definition  of  a  reducible  graph  (due  to  [HeUl])  is  stated  below. 

Definition 2.1 [HeUl] Let G = (V, A, r) be a flow graph. We define two transformations on G:

Transformation T_1: Given an arc a = (v, v) in A, remove a from A.

Transformation T_2: Let v_2 be a vertex in V-{r} and let it have a single incoming arc
a = (v_1, v_2). T_2 replaces v_1, v_2 and a by a single vertex v. Predecessors of v_1 become
predecessors of v. Successors of v_1 and v_2 become successors of v. There is an arc (v, v) if and
only if there was formerly an arc (v_2, v_1) or (v_1, v_1).

G is a reducible flow graph (rfg) if repeated applications of T_1 and T_2 (in any order) reduce G to
a single vertex.
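
Definition 2.1 translates directly into a (slow) sequential test. The sketch below is only an illustration of the two transformations, not one of the linear-time or NC algorithms cited in this paper. The function name `is_reducible` is hypothetical, the input is assumed to be a flow graph, and arcs are stored in sets, so parallel arcs are treated as a single arc.

```python
def is_reducible(vertices, arcs, root):
    """Apply T1 and T2 until neither applies; reducible iff one vertex remains."""
    succ = {v: set() for v in vertices}
    for u, v in arcs:
        succ[u].add(v)
    while True:
        # T1: remove every self-loop.
        for v in succ:
            succ[v].discard(v)
        # Recompute predecessor sets for the current graph.
        pred = {v: set() for v in succ}
        for u, outs in succ.items():
            for v in outs:
                pred[v].add(u)
        # T2: pick a non-root vertex with a single incoming arc, if any.
        v2 = next((v for v in succ if v != root and len(pred[v]) == 1), None)
        if v2 is None:
            return len(succ) == 1
        (v1,) = pred[v2]
        succ[v1].discard(v2)
        # The merged vertex keeps the name v1; an arc v2 -> v1 becomes a
        # self-loop on it, which T1 removes in the next iteration.
        succ[v1] |= succ.pop(v2)
```

For example, a single loop entered through its head reduces to one vertex, while the three-vertex graph in which the root enters a two-vertex cycle at both of its vertices does not.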

Let G = (V, A, r) be a reducible graph and let b = (u, v) be a back arc in G. Then b spans
vertex w (or w is in the span of b) if there exists a path from v to u in the DAG of G that passes
through w. Given two vertices u, v ∈ V, vertex u dominates vertex v if every path from r to v
passes through u (note that u dominates itself).

It is well-known [AhUl] that the dominator relation can be represented in the form of a tree
rooted at r, the root of the flow graph G. This tree is called the dominator tree T of G. The
descendants of a vertex v in T are the vertices dominated by v in G. A vertex v' is immediately
dominated by v if it is a child of v in T.
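
As a small illustration of these definitions, the following sketch computes dominator sets directly from the definition: u dominates v exactly when v is no longer reachable from r once u is removed. It is a naive method chosen for clarity and is not the algorithm of [Ha] or [PaGoRa]. The names `reachable`, `dominator_sets` and `immediate_dominators` are hypothetical, and the graph is assumed to be given as a dictionary mapping each vertex to a list of successors.

```python
def reachable(succ, root, blocked=None):
    """Vertices reachable from root along paths that avoid the vertex `blocked`."""
    if root == blocked:
        return set()
    seen, stack = {root}, [root]
    while stack:
        x = stack.pop()
        for y in succ.get(x, ()):
            if y != blocked and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def dominator_sets(succ, root):
    """dom[v] = set of vertices that dominate v (always including v itself)."""
    everything = reachable(succ, root)
    dom = {v: {v} for v in everything}
    for u in everything:
        for v in everything - reachable(succ, root, blocked=u):
            dom[v].add(u)
    return dom


def immediate_dominators(dom, root):
    """Parent of each non-root vertex in the dominator tree.

    The proper dominators of v form a chain in the tree, so the closest one
    is the one that itself has the most dominators.
    """
    return {v: max(dom[v] - {v}, key=lambda u: len(dom[u]))
            for v in dom if v != root}
```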

Given a set V' ⊆ V, the dominator forest F_{V'} for V' represents the dominator relation
restricted to the set V'. Let V_h = {v ∈ V | v is the head of a back arc in G}. We assume that r is
the head of a back arc in G. Hence it is easy to see that F_{V_h} is a tree; we call it the head
dominator tree of G and denote it by T_h. This tree can be constructed from T by applying
transformation T_2 of Definition 2.1 to each vertex v in T that is not a head of a back arc in G.
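
Collapsing every non-head vertex into its parent leaves each head attached to its closest proper ancestor in T that is also a head. A minimal sequential sketch of that reading follows. The function name `head_dominator_tree` and the dictionary representation `idom` (child to parent in T) are assumptions made here, and the root is assumed to be a head, as in the text.

```python
def head_dominator_tree(idom, heads, root):
    """Return the parent map of T_h.

    idom  : dict mapping each non-root vertex to its parent in the dominator tree T
    heads : set of heads of back arcs; assumed to contain root
    """
    parent = {}
    for v in heads:
        if v == root:
            continue
        w = idom[v]
        while w not in heads:     # skip over collapsed non-head ancestors
            w = idom[w]
        parent[v] = w
    return parent
```

For instance, with `idom = {1: 0, 2: 1, 3: 2, 4: 1}` and heads `{0, 2, 3}`, vertex 2 becomes a child of the root 0 and vertex 3 remains a child of 2.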

3. Parallel Algorithms for Preprocessing RFG's

In this section we present NC algorithms to test if a rooted directed graph is an rfg, to
construct the head dominator tree for an rfg, and to find a DFS tree in an rfg; we also introduce a
modified tree contraction method in this section.

3.1 Testing flow graph reducibility:

Input: G = (V, A, r) with adjacency matrix B.

1. Test if G is a flow graph, i.e., test if every vertex in V-{r} is reachable from r.

Form B*, the transitive closure of B, and check if every nondiagonal element in row r is
nonzero.

2. Construct a tree T rooted at r using the algorithm in [GaMi] to find a directed breadth-first
search tree. Mark all arcs in T as forward (f) arcs.

3. For each arc (u, v) in G-T, mark (u, v) as b if v is an ancestor of u in T, and as f otherwise.

4. Delete all arcs marked b and check if the resulting graph G' is acyclic. (If G is an rfg then G'
must be acyclic.)

Form the transitive closure B'* of the adjacency matrix B' of G' and check that for every
i < j ≤ n, one of the two entries in position (i, j) and position (j, i) in B'* is zero.

5. Compute dominators in G' using the algorithm in [PaGoRa].

6. For each arc (u, v) marked b in G, check if v dominates u. G is an rfg if and only if v
dominates u for all arcs (u, v) marked b.
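
The six steps can be mirrored by a short sequential sketch. It follows the structure of the algorithm but substitutes simple sequential techniques for the parallel subroutines: an ordinary BFS instead of [GaMi], a topological-sort check instead of a transitive closure, and a reachability-based dominance check instead of [PaGoRa]. The function name `is_rfg` is hypothetical, vertices are assumed distinct, and a self-loop is treated as a back arc (an assumption, taking each vertex to be its own ancestor).

```python
from collections import deque


def is_rfg(vertices, arcs, root):
    succ = {v: [] for v in vertices}
    for u, v in arcs:
        succ[u].append(v)

    # Steps 1-2: BFS tree from root; fail if some vertex is unreachable.
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in succ[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if len(parent) != len(succ):
        return False

    def is_ancestor(v, u):
        """True if v lies on the tree path from root to u (v == u counts)."""
        while u is not None:
            if u == v:
                return True
            u = parent[u]
        return False

    # Step 3: split the arcs into back arcs (b) and forward arcs (f).
    back = [(u, v) for u, v in arcs if is_ancestor(v, u)]
    fsucc = {v: [] for v in vertices}
    indeg = {v: 0 for v in vertices}
    for u, v in arcs:
        if not is_ancestor(v, u):
            fsucc[u].append(v)
            indeg[v] += 1

    # Step 4: G' (forward arcs only) must be acyclic; checked by topological sort.
    ready = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in fsucc[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if removed != len(succ):
        return False

    # Steps 5-6: for every back arc (u, v), v must dominate u in G'.
    def reach_avoiding(blocked):
        if blocked == root:
            return set()
        seen, stack = {root}, [root]
        while stack:
            x = stack.pop()
            for y in fsucc[x]:
                if y != blocked and y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen

    return all(u == v or u not in reach_avoiding(v) for u, v in back)
```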

Lemma 3.1 Algorithm 3.1 correctly determines if the input graph is an rfg.

Proof If G is an rfg, then its arcs can be partitioned in a unique way as forward and back arcs,
and for any back arc b = (u, v), vertex v must dominate vertex u [HeUl]. Hence the tree T found
in step 2 must contain only forward arcs of G. Further, if an arc (u, v) not in T is a back arc of
G, then v must be an ancestor of u in T. Further, consider any arc (u, v), with v an ancestor of u
in T. Then arc (u, v) completes a cycle in G, consisting of itself, followed by the path in T from
v to u. One of these arcs must be a back arc. Since all of the arcs in T are forward arcs, it
follows that (u, v) must be a back arc. Hence steps 2 and 3 of Algorithm 3.1 correctly identify the
forward and back arcs of G if G is an rfg.

A flow graph is an rfg if and only if its set of arcs can be partitioned into two sets E_1 and E_2
such that E_1 forms an acyclic subgraph D of G, and for each arc (u, v) in E_2, v dominates u in D
[HeUl]. Thus all of the tests in steps 4, 5 and 6 are satisfied if and only if G is an rfg. []

Steps 1, 2, 4 and 5 take O(log^2 n) time using Q(n) processors. Steps 3 and 6 can be
implemented in O(log n) time using a linear number of processors using tree contraction. Hence the
complexity of this algorithm is O(log^2 n) parallel time using Q(n) processors.


3.2 Forming T_h, the head dominator tree for G:


1. Use algorithm 3.1 to construct DAG G'.

2. Use the algorithm in [PaGoRa] to compute the dominator tree T for DAG G'.

3. Use tree contraction to extract the head dominator tree T_h from T.

Steps 1 and 2 take O(log^2 n) time with Q(n) processors, and step 3 takes O(log n) time with
n processors. Hence step 1 dominates the complexity of this algorithm.

3.3 NC algorithm for finding a DFS numbering for an rfg G = (V, A, r)

1. Use algorithm 3.1 to construct the DAG G'.

2. Find a DFS tree in DAG G' as follows.

i) Identify a vertex v with more than n/2 descendants for which every child has at most n/2
descendants:

Find the transitive closure B'* of B'. Determine the number of descendants of each vertex
as the sum of the nondiagonal entries in its row in B'*. Each arc in G' compares the number
of descendants of its head with the number of descendants of its tail, and marks its tail if it
is not the case that the head has at most n/2 descendants and the tail has more than n/2
descendants. The (unique) unmarked vertex is v.

ii) Find a path P from root r to v:

Find a directed spanning tree for G' by making each vertex with an incoming arc choose
one such arc as its tree arc. Form P as the path in this tree from r to v.

iii) Associate each descendant v' of v with the largest numbered child of v (numbering
according to some fixed order) from which it is reachable; associate each vertex v' not reachable
from v with the lowest vertex in path P from which it is reachable:

Use list ranking to number the vertices on P in increasing order from the root, followed by
the children of v in some fixed order. Replace all nonzero entries in the columns of B'*
corresponding to these nodes by their new number. For each row, find the maximum numbered
entry in that row, and identify it as the vertex with which the row vertex is to be associated.

iv) Recursively solve the problem in the subdags rooted at the newly numbered vertices, together
with their descendants as computed in step iii.

Lemma 3.2 Let G = (V, A, r) be a DAG, with |V| = n. There exists a unique vertex u ∈ V with
more than n/2 descendants for which every child has at most n/2 descendants.

Proof Straightforward, and is omitted. []

Lemma 3.3 Algorithm 3.3 correctly finds a DFS tree in an rfg.

Proof We observe that the algorithm constructs a DFS tree consisting of the initial path P to v,
followed by a DFS on the vertices reachable from the children of the largest numbered child of v,
followed by vertices reachable from the second largest numbered child of v (but not reachable
from the largest child of v), ..., followed by vertices reachable from the smallest numbered child
of v (but not reachable from larger numbered children of v), followed by vertices reachable from
nodes on P-{v} in reverse order of their occurrence on P. It is not difficult to see that this is a
valid depth first search. []

Step 2i takes O(log n) time using Q(n) processors. Step 2ii is very efficient: it takes
constant time using a linear number of processors on an EREW PRAM. Step 2iii takes O(log n) time
using n^2 processors on an EREW PRAM. Finally, the recursive steps take log n stages since each
new subproblem is at most half the size of the previous problem; further, the sum of the sizes of
the new problems is less than the size of the previous problem and hence the processor count is
dominated by the first stage. Thus the algorithm takes O(log^2 n) time using Q(n) processors.

Other  NC  algorithms  for  finding  a  DFST  in  a  DAG  are  known  [GhBh]. 

In the next two sections we present parallel algorithms to find minimum feedback sets in
rfg's. Our algorithms require computation on the head dominator tree T_h of the input rfg
G = (V, A, r). For this we will use a variant of tree contraction. We conclude this section with a
description of this modified tree contraction method.

Recall that a chain in a directed graph G is a path <v_1, ..., v_k> such that each v_i has
exactly one incoming arc and one outgoing arc in G. A maximal chain is one that cannot be
extended. A leaf chain <v_1, ..., v_{l-1}, v_l> in a rooted tree T = (V, A, r) consists of a maximal
chain <v_1, ..., v_{l-1}>, with v_l the unique child of v_{l-1}, and with v_l a leaf.

The two tree operations we use in our modified tree contraction method are Rake and
Shrink. As before, the Rake operation removes leaves from the tree. The Shrink operation shrinks
each maximal leaf chain in the current tree into a single vertex.

Lemma 3.4 In the modified tree contraction method, O(log n) applications of Rake followed by
Shrink suffice to transform any n node tree into a single vertex.

Proof Consider another modified tree contraction algorithm in which the Shrink operation shrinks
all maximal chains, including leaf chains, into a single vertex (one for each chain). This
modification certainly requires no more steps than regular tree contraction, and hence by the
result in [MiRe], transforms any n node tree into a single vertex in O(log n) time. But the number
of applications of Rake followed by Shrink in the above modified tree contraction method is
exactly the same as that in our modified tree contraction method, since the only difference is that
a chain gets shrunk in several stages, rather than all at once. []
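
To make the Rake and Shrink operations concrete, here is a small sequential simulation that counts how many Rake-then-Shrink rounds reduce a tree to its root. It is a sketch for experimentation only: the function name `rake_shrink_rounds` and the child-to-parent dictionary are assumptions made here, and each shrunk leaf chain is represented by its topmost vertex.

```python
def rake_shrink_rounds(parent, root):
    """parent maps every non-root vertex to its parent; returns the round count."""
    parent = dict(parent)

    def children_of(par):
        kids = {}
        for c, p in par.items():
            kids.setdefault(p, []).append(c)
        return kids

    rounds = 0
    while parent:                                   # some non-root vertex remains
        # Rake: remove all current leaves.
        kids = children_of(parent)
        for v in [v for v in parent if v not in kids]:
            del parent[v]
        # Shrink: collapse each maximal leaf chain into its topmost vertex.
        kids = children_of(parent)
        for leaf in [v for v in parent if v not in kids]:
            top = leaf
            # Climb while the parent is a chain vertex: not the root, one child.
            while parent[top] != root and len(kids[parent[top]]) == 1:
                top = parent[top]
            v = leaf
            while v != top:                         # delete the chain below top
                up = parent[v]
                del parent[v]
                v = up
        rounds += 1
    return rounds
```

A path of any length contracts in two rounds under this scheme, since the first Shrink collapses the whole chain below the root's child, whereas Rake alone would need a number of rounds equal to the height of the tree.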

In our algorithms for minimum feedback sets, we will associate appropriate computation
with the Rake and Shrink operations in order to obtain the desired result.

4.  NC  Algorithm  for  Finding  a  Minimum  FVS  in  an  Unweighted  Rfg 

We first review the basic ideas in Shamir's polynomial time sequential algorithm [Sh].
Given an rfg G = (V, A, r) together with a partial FVS S for G, a head v in G is active if there is a
DAG path from v to a corresponding tail, which is not cut by vertices in S. A maximal active
head v is an active head such that none of its proper DAG descendants in G is an active head.
The following theorem is established in [Sh].

Theorem 4.1 [Sh] Let G = (V, A, r) be an rfg, and let S be a subset of a minimum FVS in G. If v
is a maximal active head in G with respect to S, then S ∪ {v} is also a subset of a minimum FVS
in G.

Using  Theorem  4.1,  we  obtain  the  following  algorithm,  based  on  the  head  dominator  tree,  to 
construct  a  minimum  FVS  for  an  rfg. 

4.1  Minimum  FVS  Algorithm 

Input: An rfg G = (V, A, r) together with its head dominator tree T_h.

Output: A set S ⊆ V which is a minimum FVS for G.

1. Initialize S ← ∅.

2. Repeat

a) S ← S ∪ L, where L is the set of leaves in T_h.

b) G ← G-L, T_h ← T_h-L.

c) Find U, the set of heads in current G that are not active in G.

d) For each vertex v not in U, find its closest proper ancestor w in T_h that is not in U, and
make w the parent of v; remove all vertices in U from T_h.

until T_h = ∅.
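
For comparison, the greedy rule of Theorem 4.1 can also be applied directly, without the head dominator tree: repeatedly pick any maximal active head and add it to S until no head is active. The sketch below does this sequentially. It is not Algorithm 4.1 itself, and it assumes the partition of the arcs into forward and back arcs is already known; the function name `min_fvs_greedy` is hypothetical.

```python
def min_fvs_greedy(vertices, forward, back):
    """Greedy FVS for an rfg, given its forward (DAG) arcs and back arcs."""
    fsucc = {v: [] for v in vertices}
    for u, v in forward:
        fsucc[u].append(v)
    tails = {}                                # head -> set of corresponding tails
    for u, v in back:
        tails.setdefault(v, set()).add(u)

    def reach(v, cut):
        """Vertices reachable from v along forward arcs avoiding the set `cut`."""
        if v in cut:
            return set()
        seen, stack = {v}, [v]
        while stack:
            x = stack.pop()
            for y in fsucc[x]:
                if y not in cut and y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen

    S = set()
    while True:
        # A head is active if some corresponding tail is reachable without S.
        active = {h for h in tails if tails[h] & reach(h, S)}
        if not active:
            return S
        # A maximal active head has no active head among its proper DAG descendants;
        # one always exists because the forward arcs form a DAG.
        h = next(h for h in active if not ((reach(h, set()) - {h}) & active))
        S.add(h)
```

On the graph with forward arcs 0→1→2→3 and back arcs (2, 1) and (3, 0), the inner head 1 is chosen first, which also cuts the outer cycle, so the result is {1}.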

We implement the above tree computations using our modified tree contraction method. In
order to obtain a processor-efficient implementation of the above algorithm, we define T_ht, the
head-tail dominator tree of G. For each vertex u in G that is a tail of a back arc but not the head
of any back arc in G, we set the parent of u in T_ht to be v, where v is a head of a back arc in G, v
dominates u in G, and no child of v in T_h dominates u in G. T_ht is the tree obtained from T_h by
including these tail vertices of G (all of which will be leaves in T_ht). T_ht can be computed from
G in a manner similar to the computation of T_h (Algorithm 3.2). Our parallel algorithm will
perform modified tree contraction on T_h, and will also transform T_ht to keep track of the current
structure of G.

The computation associated with a Rake step is exactly one application of step 2 of the
above algorithm: we add the leaves of the current tree T_h to S, and delete them from T_h. In T_ht
we delete the subtree rooted at each of the vertices we deleted from T_h. We then determine the
active heads in the current G. A head h is active in the current G if and only if it lies in T_h, and
at least one of its corresponding tails lies in T_ht. This is easily determined for all heads in
constant time with O(m) processors on a COMMON PRAM by having a processor for each back arc
(u, v), which informs vertex v if u is in T_ht. The set U is then determined as the set of heads that
are not currently active. Each vertex in T_h (T_ht) which is not in U finds its closest proper
ancestor in T_h (T_ht) that is not in U, and makes it its new parent. Vertices in U are then deleted
from both T_h and T_ht. This computation can be done with tree contraction and takes O(log n)
time with O(n) processors on an EREW PRAM.

Now consider the operation Shrink, which shrinks each maximal leaf chain in T_h into a
single vertex. During the Shrink operation we will identify the vertices in these leaf chains that
belong to S based on the following observation. Let C = <v_1, ..., v_l> be a leaf chain, where v_l is
the leaf. Suppose v_i is in the minimum FVS S. Then the largest j < i such that v_j is in S (if such a
j exists) is immediately determined as the largest j < i for which there is some back edge (u, v_j)
such that v_j has a path to u in G-{v_i}. This is because v_i dominates all v_k, k > i; hence any cycle
containing v_j, j < i, which is cut by a vertex v_k with k > i is certainly cut by v_i; thus v_j will
become a maximal active head if v_i is added to S.

Our computation for the Shrink operation determines, for each v_i in C, the largest j < i (if it
exists) such that v_j is in S if v_i is in S. Then for each i for which j exists, we place a pointer
from v_j to v_i. This defines a forest F on {v_1, ..., v_l}. Since v_l is a leaf in T_h it belongs to S.
Hence the vertices in C that belong to S are precisely those from which v_l is reachable in F. We
identify these vertices using regular tree contraction and add them to S.


We now describe a method to compute $v_j$ for each $v_i$. Each $v_k$ determines the largest $i$
($i \ge k$) such that $v_i$ dominates all of the corresponding tails of $v_k$, and places this value in the $k$th
location of an array $A[1..l]$. Each vertex $v_i$ inspects this array and finds the largest position $j$
($j<i$) such that $A[j]<i$. Then clearly $v_j$ is the vertex with the largest index $j<i$ in the leaf chain
such that $v_j$ is in $S$ if $v_i$ is in $S$.

Each $v_i$ can determine its corresponding $v_j$ in $O(\log l)$ time with $l$ processors on an EREW
PRAM, and hence this computation can be done for all vertices in the leaf chain in $O(\log n)$ time
with $n^2$ processors. (We do not attempt to be more efficient with this computation since the
overall algorithm already requires as many processors as an NC algorithm for multiplying two
$n \times n$ matrices, and thus an $O(n^2)$ processor bound is adequate for our needs.)
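The array-based selection on a leaf chain can be sketched sequentially as follows. This is only an illustration under stated assumptions: the name `chain_vertices_in_fvs` is hypothetical, the array $A$ is passed as a Python list with `A[k-1]` holding the value for $v_k$, and plain loops replace the parallel search and the tree contraction.

```python
def chain_vertices_in_fvs(A):
    """Return the sorted indices i (1-based) of chain vertices placed in S.

    A[k-1] is the largest i >= k such that v_i dominates all tails of v_k.
    """
    l = len(A)
    pointer = {}
    for i in range(1, l + 1):
        # Largest position j < i with A[j] < i, if any; F gets the arc v_j -> v_i.
        pointer[i] = None
        for j in range(i - 1, 0, -1):
            if A[j - 1] < i:
                pointer[i] = j
                break
    # The leaf v_l is in S; the vertices from which v_l is reachable in F
    # are found by following the pointers backwards from l.
    chosen = []
    i = l
    while i is not None:
        chosen.append(i)
        i = pointer[i]
    return sorted(chosen)
```

With `A = [1, 3, 3]` the leaf $v_3$ points back to $v_1$, because $v_3$ dominates every tail of $v_2$ but not every tail of $v_1$.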

By Lemma 3.4, $O(\log n)$ applications of Rake and Shrink operations suffice to contract $T_h$
to a single vertex, and at this point we will have constructed a minimum FVS $S$. Thus this gives a
parallel algorithm to find a minimum FVS in an rfg in $O(\log^2 n)$ time using the number of
processors needed to multiply two $n \times n$ matrices. A high-level description of the parallel
algorithm is given below.

4.2 Parallel Minimum FVS Algorithm
Input: An rfg $G=(V,A,r)$.

Output: A set $S \subseteq V$ which is a minimum FVS for $G$.

1. Form $T_h$ and $T_d$; initialize $S \leftarrow \emptyset$.

2. Repeat

a) Rake leaves

i) Place leaves of $T_h$ in $S$.

ii) In $T_d$, delete the subtree rooted at each vertex raked in step i) in $T_h$.

iii) For each vertex $v$ in $T_h$, determine if $v$ is active by checking if any one of its
corresponding tails is in $T_d$.

iv) Let $U$ be the set of vertices in $T_h$ that are not active heads. For each vertex $v$ in $T_h$ ($T_d$)
that is not in $U$, find its closest proper ancestor $w$ that is not in $U$ and make it its new
parent in $T_h$ ($T_d$).

v) Delete vertices in $U$ from $T_h$ and $T_d$.
b) Shrink leaf chains

Let $\langle v_1, v_2, \ldots, v_l \rangle$ be the leaf chain, with $v_l$ the leaf.

i) For each $v_k$ find the largest $i$ ($i \ge k$) such that $v_i$ dominates all of the corresponding tails
of $v_k$, and place this value in location $k$ of array $A[1..l]$.

ii) For each $v_i$ use array $A[1..l]$ to find the largest position $j$ ($j<i$) such that $A[j]<i$. Place
an arc from $v_j$ to $v_i$ in an auxiliary graph $F$ on vertices $v_1, \ldots, v_l$.

iii) Find the set of vertices from which $v_l$ is reachable in $F$ and add these vertices to $S$.

iv) Delete vertices $v_1, \ldots, v_l$ from $T_h$ and delete the subtrees rooted at these vertices from
$T_d$.

v) Perform parts iii, iv and v of the Rake step.
until $T_h = \emptyset$.

5.  Finding  a  Minimum  FAS  in  an  Unweighted  Rfg  and  Related  Problems 

We first state some definitions and results from [Ra1], which gives a polynomial-time
sequential algorithm for finding a minimum FAS in an rfg. We then give an RNC algorithm for
this problem and related results.

A flow network $G=(V,A,s,t,C)$ is an arc-weighted directed graph with vertex set $V$ and arc
set $A$, where $s$ and $t$ are vertices in $V$ called the source and sink respectively, and $C$ is the
capacity function on the arcs which specifies the arc weights, which are always nonnegative. The
maximum flow problem asks for a flow of maximum value from $s$ to $t$ (see [Ev, FoFu, PaSt,
Ta2] for definition of a flow). A cut $C$ separating $s$ and $t$ is a set of arcs that breaks all paths from
$s$ to $t$. The capacity of $C$ is the sum of the capacities of arcs in $C$. A minimum cut separating $s$
and $t$ is a cut of minimum capacity. It is well-known that the value of a maximum flow is equal
to the value of a minimum cut [FoFu].
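As a small sequential illustration of the equality between maximum flow and minimum cut, the sketch below uses the Edmonds–Karp augmenting path method. This is a swapped-in textbook technique chosen for brevity, not the parallel algorithm of [MuVaVa]; the function name and the integer vertex numbering are assumptions made for the example.

```python
from collections import deque

def max_flow_min_cut(n, arcs, s, t):
    """arcs: list of (u, v, capacity) on vertices 0..n-1.

    Returns (flow value, list of original arcs crossing a minimum cut).
    """
    cap = [[0] * n for _ in range(n)]
    for u, v, c in arcs:
        cap[u][v] += c
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break  # no augmenting path: the flow is maximum
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck
    # Vertices reached by the last (failed) search form the source side of a minimum cut.
    source_side = {v for v in range(n) if parent[v] != -1}
    cut = [(u, v, c) for u, v, c in arcs if u in source_side and v not in source_side]
    return flow, cut
```

On any input, the capacities of the returned cut arcs sum to the returned flow value.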

Let $G=(V,A,r)$ be an arc-weighted reducible graph and let $v$ be the head of a back arc in
$G$. Let $b_1=(u_1,v_1), \ldots, b_r=(u_r,v_r)$ be the back arcs in $G$ whose heads are dominated by $v$. The
dominated back arc vertex set of $v$ is the set $V_v = \{v' \in V \mid v'$ lies on a DAG path from $v$ to some
$u_i$, $i = 1, \ldots, r\}$. It is easy to see that $v$ dominates all vertices in $V_v$.

Definition 5.1 Let $G=(V,A,r)$ be an arc-weighted reducible graph with nonnegative arc weights,
and let $v$ be the head of a back arc in $G$. For convenience of notation we denote $G_s(V_v)$, the
subgraph of $G$ induced by the dominated back arc vertex set of $v$, by $G_s(v)$. The maximum flow
network of $G$ with respect to head $v$ is a flow network $G_m(v)$ formed by splitting each head $h$ in
$G_s(v)$ into $h$ and $h'$ (see figure 1). All DAG arcs entering or leaving the original head $h$ will
enter or leave the newly formed $h$; all back arcs entering the original $h$ will enter $h'$. There will
be an arc of infinite capacity from $h'$ to a new vertex $t$. All other arcs will inherit their capacities
from their weights in $G$. We will interpret $v$ as the source and $t$ as the sink of $G_m(v)$.
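The head-splitting construction of Definition 5.1 can be sketched as follows. This is a minimal illustration under assumed conventions: arcs are given as dictionaries from vertex pairs to weights, the classification into DAG arcs and back arcs is supplied by the caller, the split copy $h'$ is represented by the hypothetical tuple `(h, "prime")`, and the sink is the string `"t"`.

```python
INF = float("inf")

def max_flow_network(vertices, dag_arcs, back_arcs):
    """Build the arcs of the flow network for one head.

    vertices  -- the dominated back arc vertex set of the head
    dag_arcs  -- dict (u, w) -> weight for DAG arcs of G
    back_arcs -- dict (u, h) -> weight for back arcs of G
    Returns a dict mapping arcs of the network to capacities.
    """
    net = {}
    # DAG arcs inside the induced subgraph keep their endpoints and weights.
    for (u, w), c in dag_arcs.items():
        if u in vertices and w in vertices:
            net[(u, w)] = c
    # Each back arc (u, h) is redirected to the split copy h', and h' is
    # joined to the sink by an arc of infinite capacity.
    for (u, h), c in back_arcs.items():
        if u in vertices and h in vertices:
            net[(u, (h, "prime"))] = c
            net[((h, "prime"), "t")] = INF
    return net
```

For the two-vertex cycle with DAG arc `("v", "a")` of weight 2 and back arc `("a", "v")` of weight 5, the network has the path `v -> a -> v' -> t`, whose minimum cut has capacity 2.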


figure 1

Constructing $G_m(v)$ from $G=(V,A,r)$

Definition 5.2 Let $G=(V,A,r)$ be an arc-weighted reducible graph. We define $G_{mm}(v)$, the
mincost maximum flow network with respect to head $v$, inductively as follows:

a. If $v$ dominates no other head in $G$ then $G_{mm}(v) = G_m(v)$.

b. Let $v_1, \ldots, v_r$ be the heads immediately dominated by $v$ in $G$ and let the capacity of a
minimum cut in $G_{mm}(v_i)$ be $c_i$, $i = 1, \ldots, r$. Then $G_{mm}(v) = (V', A')$ where $V'$ is the same as the
vertex set for $G_m(v)$ and $A' = \{\text{arcs in } G_m(v)\} \cup \{\text{arcs in } G_{mm}(v_i),\ i = 1, \ldots, r\} \cup F_v$, where
$F_v = \{ f_{v v_i} = (v, v_i) \mid i = 1, \ldots, r$, with capacity of $f_{v v_i}$ equal to $c_i \}$.

We call $F_v$ the mincost-arc set for head $v$; if $j$ is a head immediately dominated by head $i$ then
$f_{ij}$ is the mincost arc from head $i$ to head $j$.

Figure 2 gives an example of $G_{mm}(v)$.


figure 2

The mincost maximum flow network with respect to head $v$ for
the graph $G=(V,A,r)$ of figure 1

It is established in [Ra1] that the following algorithm determines the cost of a minimum FAS in
an arc-weighted reducible graph $G$. If $G$ is an unweighted graph, then by the result in [Ra4] this
value also gives the maximum number of arc disjoint cycles in $G$.


5.1 Minimum FAS Algorithm for Reducible Flow Graphs


Input: A reducible graph $G=(V,A,r)$ with nonnegative weights on arcs.
Output: The cost of a minimum FAS for $G$.


begin

1. Preprocess $G$: Label the heads of back arcs in $G$ in postorder. Derive the head dominator tree
$T_h$ for $G$. Introduce a pointer from each vertex $i$ in $T_h$ (except $r$) to its parent $h_i$. Let the number
of heads be $h$.

2. For $i = 1, \ldots, h$ process head $i$:

a. Find the capacity of a minimum cut, $c_i$, in $G_m(i)$.

b. If $i \ne h$ then introduce an arc of weight $c_i$ from $h_i$ to $i$ in $G$. (Note that $G$ changes during
the execution of the algorithm so that $G_m(i)$ is the same as $G_{mm}(i)$ if $G$ were unchanged.)

3. Output $c_h$ as cost of minimum FAS for $G$.


We implement the above tree computations once again using our modified tree contraction
method. For a Rake step, we form $G_m(l)$ for each leaf $l$ in $T_h$ and compute $c_l$, the capacity of a
minimum cut in $G_m(l)$. If $l \ne h$ then we place an arc of capacity $c_l$ from $h_l$ to $l$. Finally we delete
all leaves from the current $T_h$.

The complexity of the Rake step is dominated by the complexity of computing minimum
cuts in the mincost maximum flow networks associated with the leaves of $T_h$. The total size of all
of these networks is $O(m+n)$, where $m$ and $n$ are the number of arcs and vertices, respectively,
in $G$. Hence, using the algorithm in [MuVaVa] to compute minimum cuts, we can perform the
Rake step by a randomized algorithm that runs in $O(\log^2 n)$ parallel time with $O(m\,n^{3.5})$
processors on an EREW PRAM.

The Shrink operation is a little more involved. We assume some familiarity with the results
in [Ra1]. Let $C = \langle v_1, \ldots, v_l \rangle$ be a leaf chain in $T_h$, with $v_l$ the leaf. During the Shrink operation,
we will determine the capacity of a minimum cut in $G_{mm}(v_i)$, $i = 1, \ldots, l$. We can then use this to
construct $G_m(v_1)$, and hence the current $G$.


Our parallel algorithm will process $G_{mm}(v_1)$ in chunks. For this we develop some notation.
Let $G^{i,j}$, $1 \le i < j \le l$, be the graph $G_{mm}(v_i)$ with

a) All arcs dominated by $v_j$ deleted and replaced by a single arc of infinite capacity from $v_j$ to $t$,

b) The capacities of the mincost arcs $(v_{k-1},v_k)$, $k = i+1, \ldots, j-1$ set to $\infty$, and

c) The mincost arc $(v_{j-1},v_j)$ deleted.

The graph $G^{i,l+1}$ will denote $G_{mm}(v_i)$ with the capacities of all mincost arcs
$(v_{k-1},v_k)$, $k = i+1, \ldots, l$ set to $\infty$. We will denote the mincost arc from $v_{j-1}$ to $v_j$, $j>i$, in $G_{mm}(v_i)$
by $e_j$ and call it a chain mincost arc.

We will use the notation $n^{i,j}$ to denote the value of a minimum cut in $G^{i,j}$, and $m_i$ to
denote the value of a minimum cut in $G_{mm}(v_i)$. This defines $m_i$ for $i \le l$; we set $m_i = 0$ if $i = l+1$.

Lemma 5.1 Let $M$ be a minimum cut in $G_{mm}(v_i)$ that contains a chain mincost arc $e_j$. Then $v_{j-1}$
is separated from $v_j$ by $M$.

Proof Consider the vertex partition $S \cup T$ induced by $M$, where $S$ is the set of vertices in the
component containing $v_i$ in $G-M$. If both $v_{j-1}$ and $v_j$ are in $S$ (or $T$) then we can remove $e_j$
from $M$ and still have a cut, contradicting the fact that $M$ is a minimum cut. If $v_{j-1}$ is in $S$ and $v_j$
is in $T$, then $v_{j-1}$ is separated from $v_j$ by the cut as required. Finally, $v_j$ in $S$ and $v_{j-1}$ in $T$ is not
possible since every path from $v_i$ to $v_j$ must pass through $v_{j-1}$. []

Lemma 5.2 Let $M$ be a minimum cut in $G_{mm}(v_i)$. Then $M$ contains at most one chain mincost
arc.

Proof Suppose $M$ contains two chain mincost arcs $e_j$ and $e_k$, $j<k$. By Lemma 5.1, $v_{j-1}$ is
separated from $v_j$ by $M$. Hence every path from $v_i$ to $v_k$ is cut by $M-\{e_k\}$. Thus $M-\{e_k\}$ is a cut
for $G_{mm}(v_i)$, contradicting the fact that $M$ is a minimum cut. []

Lemma 5.3 Let $M$ be a minimum cut in $G_{mm}(v_k)$, for some $k$, $1 \le k \le l$, and let $M$ separate $v_i$ from
$v_{i-1}$ for some $i>k$. Let $N$ be a minimum cut in $G^{k,i}$. Then $N \cup \{e_i\}$ is a minimum cut for
$G_{mm}(v_k)$, $n^{k,i}+m_i$ is the value of a minimum cut in $G_{mm}(v_k)$, and $N \cup M_i$ is a minimum FAS for
$G_s(v_k)$, where $M_i$ is a minimum FAS for $G_s(v_i)$.
Proof Let $G'_{mm}(v_k)$ be $G_{mm}(v_k)$ with the capacities of mincost arcs $(v_{j-1},v_j)$, $j = k+1, k+2, \ldots, i-1$
increased to $\infty$. $M$ will continue to be a minimum cut in $G'_{mm}(v_k)$ since $M$ separates $v_i$ from $v_{i-1}$
in $G_{mm}(v_k)$. Since $M$ is a minimum cut, none of the arcs in $M$ are dominated by $v_i$, since $M$
would be a cut even if all such arcs are deleted from it. Since $M$ must contain $e_i$ we can write
$M = M' \cup \{e_i\}$, where $M'$ contains no mincost arc.

Now consider $N$. $N$ is a minimum cut in $G^{k,i}$. Since all of the mincost arcs in $G^{k,i}$ have
infinite capacity, $N$ contains no mincost arc, and every arc in $N$ appears in $G_{mm}(v_k)$. $N$ must
separate $v_{i-1}$ from $v_i$ in $G^{k,i}$, since there is a path of infinite capacity from $v_k$ to $v_{i-1}$ and from $v_i$
to $t$ in $G^{k,i}$. Hence $N \cup \{e_i\}$ separates $v_k$ from $t$ in $G_{mm}(v_k)$. Finally, $|N| \le |M'|$ since $M'$ is a
cut for $G^{k,i}$. Hence $N \cup \{e_i\}$ is a minimum cut for $G_{mm}(v_k)$.

By definition, the capacity of $e_i$ is the capacity of a minimum cut in $G_{mm}(v_i)$, which is $m_i$.
Hence $n^{k,i}+m_i$ is the capacity of a minimum cut in $G_{mm}(v_k)$.

Finally, since $N$ has no mincost arcs and $N \cup \{e_i\}$ is a minimum cut for $G_{mm}(v_k)$, it follows
from the results in [Ra1] that $N \cup M_i$ is a minimum FAS for $G_s(v_k)$, where $M_i$ is any
minimum FAS for $G_s(v_i)$. []

Lemma 5.4 Let $1 \le i = i_1 < i_2 < \cdots < i_r = j \le l+1$ be any sequence of indices. Let $N_k$ be a minimum
cut in $G^{i_k,i_{k+1}}$, $k = 1, \ldots, r-1$. Then $N_1 \cup \cdots \cup N_{r-1} \cup M_j$ is an FAS for $G_s(v_i)$, where $M_j$
is an FAS for $G_s(v_j)$.

Proof By induction on $r$.

Base: $r = 2$. Suppose $j \le l$. $N_1$ is a minimum cut in $G^{i,j}$. Hence $N_1$ must separate $v_{j-1}$ from $v_j$ in
$G^{i,j}$, since there is a path of infinite capacity from $v_i$ to $v_{j-1}$ and from $v_j$ to $t$ in $G^{i,j}$. Hence
$N_1 \cup \{e_j\}$ is a cut in $G_{mm}(v_i)$. Hence by the results in [Ra1], $N_1 \cup M_j$ is an FAS for $G_s(v_i)$,
where $M_j$ is any FAS for $G_s(v_j)$. If $j = l+1$ then we note that any cut in $G^{i,l+1}$ is an FAS for
$G_s(v_i)$ as well.

Induction step: Assume that the statement of the lemma is true for all sequences of indices of
length $r = r'-1$ or less, and let $r = r'$. By the base case, $N_1 \cup M_{i_2}$ is an FAS for $G_s(v_i)$, where $M_{i_2}$ is
any FAS for $G_s(v_{i_2})$. By the inductive hypothesis, $N_2 \cup \cdots \cup N_{r-1} \cup M_j$ is an FAS for $G_s(v_{i_2})$.
Hence $N_1 \cup \cdots \cup N_{r-1} \cup M_j$ is an FAS for $G_s(v_i)$. []

Lemma 5.5 Let $1 \le i = i_1 < i_2 < \cdots < i_r = j \le l+1$ be a sequence of indices such that there exists a
minimum cut in $G_{mm}(v_{i_k})$ that separates $v_{i_{k+1}-1}$ from $v_{i_{k+1}}$, $k = 1, \ldots, p$, where $p = r-1$ if $j \le l$ and
$p = r-2$ if $j = l+1$. Let $N_k$ be a minimum cut in $G^{i_k,i_{k+1}}$, $k = 1, \ldots, r-1$. Let $M_j$ be a minimum FAS
for $G_s(v_j)$. Then $N_1 \cup \cdots \cup N_{r-1} \cup M_j$ is a minimum FAS for $G_s(v_i)$.

Proof By induction on $r$.

Base: $r = 2$. Then $i = i_1$ and $j = i_2$. If $j \le l$ then the result follows from Lemma 5.3. If $j = l+1$ then no
minimum cut in $G_{mm}(v_i)$ separates any $v_k$ from $v_{k-1}$, $i < k \le l$. Hence any minimum cut in $G^{i,l+1}$
is a minimum cut for $G_{mm}(v_i)$ and the result follows.

Induction step: Assume that the statement of the lemma is true for all sequences of indices of
length $r = r'-1$ or less, and let $r = r'$. By Lemma 5.3, $N_1 \cup M_{i_2}$ is a minimum FAS for $G_s(v_i)$,
where $M_{i_2}$ is any minimum FAS for $G_s(v_{i_2})$. By the inductive hypothesis, $N_2 \cup \cdots \cup N_{r-1} \cup M_j$ is
a minimum FAS for $G_s(v_{i_2})$. Hence $N_1 \cup \cdots \cup N_{r-1} \cup M_j$ is a minimum FAS for $G_s(v_i)$. []

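Let $l^{i,j}$ be the minimum value of the sum $n^{i,i_2} + n^{i_2,i_3} + \cdots + n^{i_{r-1},j}$, where
$1 \le i < i_2 < \cdots < i_{r-1} < j \le l+1$, and the indices $i_k$ and their number $r-2 \ge 0$ are allowed to range over
all permissible values. Note that $l^{i,i+1} = n^{i,i+1}$.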

Lemma 5.6 Let $1 \le i = i_1 < i_2 < \cdots < i_r = j \le l+1$ be a sequence of indices such that there exists a
minimum cut in $G_{mm}(v_{i_k})$ that separates $v_{i_{k+1}-1}$ from $v_{i_{k+1}}$, $k = 1, \ldots, p$, where $p = r-1$ if $j \le l$ and
$p = r-2$ if $j = l+1$. Then the cost of a minimum FAS for $G_s(v_i)$ is $l^{i,j}+m_j$.

Proof By Lemma 5.4, there exists an FAS in $G_s(v_i)$ with cost $n^{i,j_2} + n^{j_2,j_3} + \cdots + n^{j_{r-1},j} + m_j$ for any
sequence of indices $i < j_2 < \cdots < j_{r-1} < j$. By Lemma 5.5, a minimum FAS in $G_s(v_i)$ has cost
$n^{i,i_2} + n^{i_2,i_3} + \cdots + n^{i_{r-1},j} + m_j$. Hence the indices $i_2, \ldots, i_{r-1}$ will contribute to the minimum in the
expression for $l^{i,j}$. Thus the cost of a minimum FAS for $G_s(v_i)$ is $l^{i,j}+m_j$. []

Corollary $m_i = l^{i,l+1}$.


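Lemma 5.7 Let $\lambda = \lceil (i+j)/2 \rceil$. Then $l^{i,j} = \min_{i \le k < \lambda \le p \le j} \left( l^{i,k} + n^{k,p} + l^{p,j} \right)$, where we take $l^{i,i} = 0$.

Proof An easy proof shows that $l^{i,j} \le \min_{i \le k < \lambda \le p \le j} \left( l^{i,k} + n^{k,p} + l^{p,j} \right)$ and
$\min_{i \le k < \lambda \le p \le j} \left( l^{i,k} + n^{k,p} + l^{p,j} \right) \le l^{i,j}$. []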

The  characterization  in  Lemma  5.7  leads  to  the  following  implementation  of  the  Shrink  step 
in  the  parallel  algorithm  to  find  the  value  of  a  minimum  FAS  in  an  rfg. 


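5.2 Parallel FAS Algorithm for SHRINK Step

Input: Graph $G_{mm}(v_1)$ with its associated head dominator path $\langle v_1, v_2, \ldots, v_l \rangle$.


1. For $i = 1, \ldots, l$ do


Compute $l^{i,i+1}$ by finding the value of a minimum cut in $G^{i,i+1}$.

2. For $i = 1, \ldots, l$

For $j = i+1, \ldots, l+1$ do

Compute $n^{i,j}$ by finding the value of a minimum cut in $G^{i,j}$.

3. For $k = 1, 2, \ldots, \lceil \log l \rceil$ do
for $i = 1, \ldots, l-2^{k-1}$

for $j = i+2^{k-1}+1, i+2^{k-1}+2, \ldots, i+2^k$ (with $j \le l+1$) do

Let $\lambda = \lceil (i+j)/2 \rceil$
$l^{i,j} \leftarrow \min_{i \le q < \lambda \le p \le j} \left( l^{i,q} + n^{q,p} + l^{p,j} \right)$

4. For $i = 1, \ldots, l$ output $l^{i,l+1}$ as value of minimum cut for $G_{mm}(v_i)$.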
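The doubling computation in step 3 can be checked on small inputs with a sequential sketch. The code below is illustrative only: the function name `chain_costs` is hypothetical, the cut values $n^{i,j}$ are assumed to be supplied in a dictionary keyed by `(i, j)` with $1 \le i < j \le l+1$, the convention $l^{i,i} = 0$ is assumed, and nested loops replace the parallel minimum.

```python
import math

def chain_costs(n_val, l):
    """Return a dict L with L[(i, j)] = minimum over index sequences from i to j
    of the sum of the corresponding n_val entries."""
    L = {(i, i): 0 for i in range(1, l + 2)}
    for i in range(1, l + 1):
        L[(i, i + 1)] = n_val[(i, i + 1)]  # a single step has no intermediate index
    rounds = math.ceil(math.log2(l))
    for k in range(1, rounds + 1):
        half, full = 2 ** (k - 1), 2 ** k
        for i in range(1, l + 1):
            # Round k handles pairs whose distance j - i lies in (2^(k-1), 2^k].
            for j in range(i + half + 1, min(i + full, l + 1) + 1):
                lam = (i + j + 1) // 2  # ceiling of (i + j) / 2
                # Every index sequence has exactly one step (q, p) with q < lam <= p.
                L[(i, j)] = min(
                    L[(i, q)] + n_val[(q, p)] + L[(p, j)]
                    for q in range(i, lam)
                    for p in range(lam, j + 1)
                )
    return L
```

Each round only reads entries whose index distance is at most half of the distance being computed, which is why about $\log l$ rounds suffice.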


Let $G_{mm}(v_1)$ have $r$ vertices and $s$ arcs. Step 1 requires the computation of $l \le r$ minimum
cuts in parallel on graphs whose total size is $O(r+s)$. Hence this step can be executed by a
randomized algorithm in $O(\log^2 r)$ time with $O(s\,r^{3.5})$ processors on an EREW PRAM, using the
algorithm in [MuVaVa]. Step 2 requires the computation of $O(l^2)$ minimum cuts and in the
worst case this requires $O(\log^2 r)$ time using $O(s\,r^{5.5})$ processors for a randomized algorithm on
an EREW PRAM. The inner loop of step 3 (using indices $i$ and $j$) can be executed in constant
time with $O(l^4) = O(r^4)$ processors and hence step 3 can be executed in $O(\log r)$ time with $O(r^4)$
processors by a deterministic algorithm on an EREW PRAM. Thus the complexity of the Shrink
step is dominated by step 2.

The FAS algorithm uses the above Rake and Shrink operations $O(\log n)$ times. Hence it is a
randomized algorithm that runs on an EREW PRAM in $O(\log^3 n)$ time using $O(m\,n^{5.5})$
processors. This is an RNC algorithm.

At this point we have the value of a minimum cut in all $G_{mm}(v)$, where $v$ is the head of a
back arc in $G$. Hence we can construct $G_{mm}(v)$ for each such $v$ and find a minimum cut in each
of these graphs in parallel using the RNC algorithm of [MuVaVa]. From this we can extract a
minimum FAS for $G$ as follows: Place a pointer from each mincost arc $(u,v)$ in any of these
minimum cuts to the minimum cut for $G_{mm}(v)$. Now a minimum FAS for $G$ consists of the set
of arcs in $G$ that are in some minimum cut that is reachable from the minimum cut for $G_{mm}(r)$ in
this pointer structure. This is an easy NC computation and thus we obtain an RNC algorithm to
find a minimum FAS in the rfg $G$.
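The pointer-following extraction can be sketched sequentially as a graph search. The representation here is an assumption made for illustration: each minimum cut is a set of tagged triples, `("mincost", u, v)` for a mincost arc and `("arc", u, v)` for an arc of $G$, and the function name `extract_fas` is hypothetical.

```python
def extract_fas(min_cuts, root):
    """min_cuts maps each head to the set of tagged arcs in its minimum cut.

    Returns the arcs of G lying in some minimum cut reachable from the root's cut.
    """
    fas = set()
    seen = {root}
    stack = [root]
    while stack:
        head = stack.pop()
        for kind, u, v in min_cuts[head]:
            if kind == "mincost":
                # A mincost arc (u, v) points to the minimum cut stored for head v.
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
            else:
                fas.add((u, v))
    return fas
```

Cuts of heads that are never pointed to from the root's cut contribute nothing to the result.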

Finally we present some results on the parallel complexity of finding feedback sets in
weighted rfg’s.

Lemma  5.8  The  following  problems  are  reducible  to  one  another  through  NC  reductions. 

1)  Finding  a  minimum  FAS  in  an  unweighted  rfg. 

2)  Finding  a  minimum  weight  FAS  in  an  rfg  with  unary  weights  on  arcs. 

3)  Finding  a  minimum  weight  FVS  in  an  rfg  with  unary  weights  on  vertices. 

4)  Finding  a  minimum  cut  in  a  flow  network  with  capacities  in  unary. 

Proof: Polynomial-time reductions between 1), 2), and 3) are given in [Ra1]. We note
that all of these reductions are NC reductions as well. We show that 4) reduces to 2): We use the
NC reduction in [Ra2] from the problem of finding a minimum cut in a general flow network $G$
to the problem of finding a minimum cut in an acyclic flow network $N$. A minimum cut for $N$ can
be obtained by finding a minimum weight FAS in the graph $G'$ derived from $N$ by coalescing source
and sink.

We also have the result that 2) reduces to 4) since our parallel FAS algorithm uses $O(\log n)$
applications of an algorithm for 4) together with some additional NC computation. []

Lemma 5.9 The following two problems are P-complete:

1)  Finding  a  minimum  weight  FAS  in  an  rfg  with  arbitrary  weights  on  arcs. 

2)  Finding  a  minimum  weight  FVS  in  an  rfg  with  arbitrary  weights  on  vertices. 

Proof: It is established in [Ra2] that finding a minimum cut in acyclic networks is P-complete. Let
$G$ be an acyclic network with source $s$ and sink $t$. Let $G'$ be formed from $G$ by combining $s$ and
$t$ into a single vertex $r$. Then $G'$ is an arc-weighted rfg rooted at $r$ and a minimum weight FAS
in $G'$ gives a minimum cut in $G$. Hence part 1 of the lemma follows.
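The transformation used in this proof is simple enough to sketch directly. The function name `merge_source_and_sink` and the dictionary representation of the network are assumptions made for illustration.

```python
def merge_source_and_sink(arcs, s, t, r="r"):
    """arcs: dict (u, v) -> capacity of an acyclic network with source s and sink t.

    Returns the arcs of the graph obtained by combining s and t into one vertex r.
    """
    def rename(x):
        return r if x in (s, t) else x

    merged = {}
    for (u, v), c in arcs.items():
        key = (rename(u), rename(v))
        # Arcs that become parallel after the merge have their capacities added.
        merged[key] = merged.get(key, 0) + c
    return merged
```

Every path from `s` to `t` in the network becomes a cycle through `r`, so a set of arcs breaks all such paths exactly when it breaks all cycles of the merged graph.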

We can reduce the minimum weight FAS problem on rfg’s to the minimum weight FVS
problem on rfg’s by replacing each arc $(u,v)$ in the arc-weighted graph $G$ by two arcs $(u,w)$ and $(w,v)$
and assigning to $w$ the weight of arc $(u,v)$. The original vertices in $G$ are assigned a weight $nW$,
where $W$ is the maximum weight of any arc in $G$, and $n$ is the number of vertices in $G$. It is
easy to see that a minimum-weight FVS in the new graph gives back a minimum weight FAS in
$G$ having the same weight. This establishes part 2 of the lemma. []
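The arc-subdivision reduction can be sketched as follows. The function name `fas_to_fvs_instance` and the tuple `("arc", u, v)` used to name each new middle vertex are assumptions made for illustration.

```python
def fas_to_fvs_instance(vertices, arc_weights):
    """arc_weights: dict (u, v) -> weight of the arc-weighted graph.

    Returns (arcs of the new graph, dict of vertex weights).
    """
    n = len(vertices)
    W = max(arc_weights.values())
    # Original vertices receive the weight n * W given in the text.
    vertex_weights = {v: n * W for v in vertices}
    new_arcs = []
    for (u, v), w in arc_weights.items():
        middle = ("arc", u, v)  # the new vertex subdividing the arc (u, v)
        vertex_weights[middle] = w
        new_arcs.append((u, middle))
        new_arcs.append((middle, v))
    return new_arcs, vertex_weights
```

Deleting a middle vertex in the new graph corresponds to deleting the arc it subdivides in the original graph.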